 cypher statement


SAKA: An Intelligent Platform for Semi-automated Knowledge Graph Construction and Application

Zhang, Hanrong, Wang, Xinyue, Pan, Jiabao, Wang, Hongwei

arXiv.org Artificial Intelligence

Knowledge graph (KG) technology is widely used in many areas, and many companies offer KG-based applications. Nonetheless, most KG platforms require expertise and considerable time and effort from users to construct KG records manually, which is a significant barrier for non-expert users. Additionally, audio data is abundant and holds valuable information, but it is challenging to transform into a KG. Furthermore, these platforms usually do not leverage the full potential of the KGs that users construct. In this paper, we propose an intelligent and user-friendly platform for Semi-automated KG Construction and Application (SAKA) to address these problems. First, users can semi-automatically construct KGs from structured data from many domains by interacting with the platform; multiple versions of a KG can then be stored, viewed, managed, and updated. Second, we propose an Audio-based KG Information Extraction (AGIE) method to build KGs from audio data. Lastly, the platform creates a semantic parsing-based knowledge base question answering (KBQA) system on top of the user-created KGs. We demonstrate the feasibility of the semi-automatic KG construction method on the SAKA platform.
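The abstract does not say how SAKA turns structured data into graph records, so the following is only a minimal sketch of the general idea: mapping table rows to parameterized Cypher MERGE statements. The helper `row_to_merge`, the column names `head` and `tail`, and the labels `Drug`, `Symptom`, and `TREATS` are all hypothetical and chosen purely for illustration.

```python
import re

# Hypothetical helper, not part of SAKA: names and schema are illustrative only.
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def _check_ident(name):
    """Reject labels or relationship types that are not plain identifiers."""
    if not _IDENT.match(name):
        raise ValueError(f"unsafe identifier: {name!r}")
    return name


def row_to_merge(row, head_label, tail_label, rel_type):
    """Build one parameterized Cypher MERGE statement from a (head, tail) row."""
    for ident in (head_label, tail_label, rel_type):
        _check_ident(ident)
    # Labels and relationship types cannot be Cypher parameters, so they are
    # interpolated after validation; the values themselves stay parameters.
    query = (
        f"MERGE (h:{head_label} {{name: $head}}) "
        f"MERGE (t:{tail_label} {{name: $tail}}) "
        f"MERGE (h)-[:{rel_type}]->(t)"
    )
    params = {"head": row["head"], "tail": row["tail"]}
    return query, params


# Assumed sample rows standing in for a user's structured data.
rows = [
    {"head": "Aspirin", "tail": "Headache"},
    {"head": "Ibuprofen", "tail": "Fever"},
]
statements = [row_to_merge(r, "Drug", "Symptom", "TREATS") for r in rows]
for query, params in statements:
    print(query, params)
```

Using MERGE rather than CREATE keeps repeated imports of the same row from producing duplicate nodes, which matters when a graph is updated over several versions.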


Combining LLMs and Knowledge Graphs to Reduce Hallucinations in Question Answering

Pusch, Larissa, Conrad, Tim O. F.

arXiv.org Artificial Intelligence

Advancements in natural language processing have revolutionized the way we can interact with digital information systems, such as databases, making them more accessible. However, challenges persist, especially when accuracy is critical, as in the biomedical domain. A key issue is the hallucination problem, where models generate information unsupported by the underlying data, potentially leading to dangerous misinformation. This paper presents a novel approach designed to bridge this gap by combining Large Language Models (LLMs) and Knowledge Graphs (KGs) to improve the accuracy and reliability of question-answering systems, using a biomedical KG as an example. Built on the LangChain framework, our method incorporates a query checker that ensures the syntactic and semantic validity of LLM-generated queries, which are then used to extract information from a Knowledge Graph, substantially reducing errors like hallucinations. We evaluated the overall performance using a new benchmark dataset of 50 biomedical questions, testing several LLMs, including GPT-4 Turbo and llama3:70b. Our results indicate that while GPT-4 Turbo outperforms other models in generating accurate queries, open-source models like llama3:70b show promise with appropriate prompt engineering. To make this approach accessible, a user-friendly web-based interface has been developed, allowing users to input natural language queries, view generated and corrected Cypher queries, and verify the resulting paths for accuracy. Overall, this hybrid approach effectively addresses common issues such as data gaps and hallucinations, offering a reliable and intuitive solution for question-answering systems. The source code for generating the results of this paper and for the user interface can be found in our Git repository: https://git.zib.de/lpusch/cyphergenkg-gui
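To make the idea of a query checker concrete, here is a rough sketch of validating a generated Cypher query against a known schema before it is run. It is not the authors' implementation and does not use LangChain; the toy `SCHEMA`, the function `check_query`, and the regular expressions are assumptions, and a real checker would need a proper Cypher parser rather than pattern matching.

```python
import re

# Assumed toy schema; the paper's biomedical KG schema is not given in the abstract.
SCHEMA = {
    "labels": {"Drug", "Disease", "Gene"},
    "relationships": {"TREATS", "ASSOCIATED_WITH"},
}

# Match the label in node patterns like (d:Drug) and the type in [r:TREATS].
LABEL_RE = re.compile(r"\(\s*\w*\s*:\s*(\w+)")
REL_RE = re.compile(r"\[\s*\w*\s*:\s*(\w+)")


def check_query(query, schema=SCHEMA):
    """Return a list of problems found in a Cypher query string (empty if none)."""
    problems = []
    # Crude syntactic checks: balanced delimiters and a RETURN clause.
    if query.count("(") != query.count(")"):
        problems.append("unbalanced parentheses")
    if query.count("[") != query.count("]"):
        problems.append("unbalanced brackets")
    # Crude semantic checks: every label and relationship type must be in the schema.
    for label in LABEL_RE.findall(query):
        if label not in schema["labels"]:
            problems.append(f"unknown label: {label}")
    for rel in REL_RE.findall(query):
        if rel not in schema["relationships"]:
            problems.append(f"unknown relationship: {rel}")
    if not re.search(r"\bRETURN\b", query, re.IGNORECASE):
        problems.append("missing RETURN clause")
    return problems


good = "MATCH (d:Drug)-[:TREATS]->(x:Disease) RETURN d.name"
bad = "MATCH (d:Drug)-[:CURES]->(x:Illness) RETURN d.name"
print(check_query(good))
print(check_query(bad))
```

A query that references a relationship or label absent from the graph would otherwise return an empty result, which an LLM might then paper over with an invented answer; rejecting or repairing the query first is one way to avoid that.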


Represent United Kingdom's public record as a knowledge graph

#artificialintelligence

I love constructing knowledge graphs from various sources. I've wanted to create a government knowledge graph for some time now but was struggling to find data that is easily accessible and doesn't require me to spend weeks developing a data pipeline. At first, I thought I would have to use OCR and NLP techniques to extract valuable information from public records, but luckily I stumbled upon the UK Gazette. The UK Gazette is a website that holds the United Kingdom's official public record information. All the content on the website and via its APIs is available under the Open Government Licence v3.0.


Optimize fetching data from Neo4j with Apache Arrow

#artificialintelligence

The year is 2022, and graph machine learning is one of the rising trends in data analytics. While Neo4j has a Graph Data Science library that supports multiple graph algorithms and machine learning workflows, sometimes you want to export data from Neo4j and run it through your favorite machine learning framework, like PyTorch or TensorFlow. In that scenario, you want to be able to export data from Neo4j in a fast and scalable way. Unfortunately, the Neo4j Python driver is not the most efficient way of retrieving data. But no need to worry, Dave Voutila has got your back.